Knowledge graphs, modeling multi-relational data, improve numerous applications such as question answering or graph logical reasoning. Many graph neural networks for such data emerged recently, often outperforming shallow architectures. However, the design of such multi-relational graph neural networks is ad-hoc, driven mainly by intuition and empirical insights. Up to now, their expressivity, their relation to each other, and their (practical) learning performance is poorly understood. Here, we initiate the study of deriving a more principled understanding of multi-relational graph neural networks. Namely, we investigate the limitations in the expressive power of the well-known Relational GCN and Compositional GCN architectures and shed some light on their practical learning performance. By aligning both architectures with a suitable version of the Weisfeiler-Leman test, we establish under which conditions both models have the same expressive power in distinguishing non-isomorphic (multi-relational) graphs or vertices with different structural roles. Further, by leveraging recent progress in designing expressive graph neural networks, we introduce the $k$-RN architecture that provably overcomes the expressiveness limitations of the above two architectures. Empirically, we confirm our theoretical findings in a vertex classification setting over small and large multi-relational graphs.
translated by 谷歌翻译
Formulating and answering logical queries is a standard communication interface for knowledge graphs (KGs). Alleviating the notorious incompleteness of real-world KGs, neural methods achieved impressive results in link prediction and complex query answering tasks by learning representations of entities, relations, and queries. Still, most existing query answering methods rely on transductive entity embeddings and cannot generalize to KGs containing new entities without retraining the entity embeddings. In this work, we study the inductive query answering task where inference is performed on a graph containing new entities with queries over both seen and unseen entities. To this end, we devise two mechanisms leveraging inductive node and relational structure representations powered by graph neural networks (GNNs). Experimentally, we show that inductive models are able to perform logical reasoning at inference time over unseen nodes generalizing to graphs up to 500% larger than training ones. Exploring the efficiency--effectiveness trade-off, we find the inductive relational structure representation method generally achieves higher performance, while the inductive node representation method is able to answer complex queries in the inference-only regime without any training on queries and scales to graphs of millions of nodes. Code is available at https://github.com/DeepGraphLearning/InductiveQE.
translated by 谷歌翻译
基于1-HOP邻居之间的消息传递(MP)范式交换信息的图形神经网络(GNN),以在每一层构建节点表示。原则上,此类网络无法捕获在图形上学习给定任务的可能或必需的远程交互(LRI)。最近,人们对基于变压器的图的开发产生了越来越多的兴趣,这些方法可以考虑超出原始稀疏结构以外的完整节点连接,从而实现了LRI的建模。但是,仅依靠1跳消息传递的MP-gnn与位置特征表示形式结合使用时通常在几个现有的图形基准中表现得更好,因此,限制了Transferter类似体系结构的感知效用和排名。在这里,我们介绍了5个图形学习数据集的远程图基准(LRGB):Pascalvoc-SP,Coco-SP,PCQM-Contact,Peptides-Func和肽结构,可以说需要LRI推理以在给定的任务中实现强大的性能。我们基准测试基线GNN和Graph Transformer网络,以验证捕获长期依赖性的模型在这些任务上的性能明显更好。因此,这些数据集适用于旨在捕获LRI的MP-GNN和Graph Transformer架构的基准测试和探索。
translated by 谷歌翻译
我们提出了一个食谱,讲述了如何建立具有线性复杂性和最先进的结果的一般,功能可扩展的(GPS)图形变压器,并在各种基准测试基准上。 Graph Transformers(GTS)在图形表示学习领域中获得了多种近期出版物的知名度,但它们对构成良好的位置或结构编码的共同基础以及与众不同的区别。在本文中,我们总结了具有更清晰的定义的不同类型的编码,并将其分类为$ \ textit {local} $,$ \ textit {global} $或$ \ textit {fextit {ferseal} $。此外,GTS仍被限制在具有数百个节点的小图上,我们提出了第一个具有复杂性线性的体系结构对节点和边缘$ O(n+e)$的数量,通过将局部实质汇总从完全 - 连接的变压器。我们认为,这种解耦并不会对表现性产生负面影响,而我们的体系结构是图形的通用函数近似器。我们的GPS配方包括选择3种主要成分:(i)位置/结构编码,(ii)局部消息通讯机制和(iii)全局注意机制。我们构建和开源一个模块化框架$ \ textit {graphgps} $,该{GraphGps} $支持多种类型的编码,并且在小图和大图中提供效率和可扩展性。我们在11个基准测试上测试了我们的体系结构,并对所有这些基准显示出非常具竞争力的结果,展示了由模块化和不同策略组合获得的经验益处。
translated by 谷歌翻译
在知识图上回答复杂的一阶逻辑(FOL)查询是多跳推理的基本任务。传统的符号方法穿越完整的知识图来提取答案,从而为每个步骤提供良好的解释。最近的神经方法学习复杂查询的几何嵌入。这些方法可以推广到不完整的知识图,但是它们的推理过程很难解释。在本文中,我们提出了图形神经网络查询执行器(GNN-QE),这是一种神经符号模型,享有两全其美的优势。 GNN-QE将复杂的数据分解为模糊集的关系预测和逻辑操作,这为中间变量提供了解释性。为了理解丢失的链接,GNN-QE从知识图完成中调整了图神经网络以执行关系预测,并使用产品模糊逻辑对逻辑操作进行建模。 3个数据集的实验表明,GNN-QE在回答FOL查询时显着改善了先前的最新模型。同时,GNN-QE可以在没有明确监督的情况下预测答案的数量,并为中间变量提供可视化。
translated by 谷歌翻译
多跳跃逻辑推理是在知识图(KGS)上学习领域的一个已建立问题。它涵盖了单跳连接预测以及其他更复杂的逻辑查询类型。现有的算法仅在经典的三重基图上运行,而现代KG经常采用超相关的建模范式。在此范式中,键入的边缘可能具有几对键值对,称为限定符,可为事实提供细粒度的环境。在查询中,此上下文修改了关系的含义,通常会减少答案集。经常在现实世界中的应用程序中观察到超相关的查询,并且现有的近似查询答案方法无法使用预选赛对。在这项工作中,我们弥合了这一差距,并将多跳的推理问题扩展到了超级关系的KG,允许解决这一新类型的复杂查询。在图形神经网络和查询嵌入技术的最新进展之下,我们研究了如何嵌入和回答超相关的连词查询。除此之外,我们还提出了一种回答此类查询并在我们的实验中证明的方法,即预选赛可以改善对各种查询模式的查询回答。
translated by 谷歌翻译
最近公布的知识图形嵌入模型的实施,培训和评估的异质性已经公平和彻底的比较困难。为了评估先前公布的结果的再现性,我们在Pykeen软件包中重新实施和评估了21个交互模型。在这里,我们概述了哪些结果可以通过其报告的超参数再现,这只能以备用的超参数再现,并且无法再现,并且可以提供洞察力,以及为什么会有这种情况。然后,我们在四个数据集上进行了大规模的基准测试,其中数千个实验和24,804 GPU的计算时间。我们展示了最佳实践,每个模型的最佳配置以及可以通过先前发布的最佳配置进行改进的洞察。我们的结果强调了模型架构,训练方法,丢失功能和逆关系显式建模的组合对于模型的性能来说至关重要,而不仅由模型架构决定。我们提供了证据表明,在仔细配置时,若干架构可以获得对最先进的结果。我们制定了所有代码,实验配置,结果和分析,导致我们在https://github.com/pykeen/pykeen和https://github.com/pykeen/benchmarking中获得的解释
translated by 谷歌翻译
Neural networks have achieved impressive results on many technological and scientific tasks. Yet, their empirical successes have outpaced our fundamental understanding of their structure and function. By identifying mechanisms driving the successes of neural networks, we can provide principled approaches for improving neural network performance and develop simple and effective alternatives. In this work, we isolate the key mechanism driving feature learning in fully connected neural networks by connecting neural feature learning to the average gradient outer product. We subsequently leverage this mechanism to design \textit{Recursive Feature Machines} (RFMs), which are kernel machines that learn features. We show that RFMs (1) accurately capture features learned by deep fully connected neural networks, (2) close the gap between kernel machines and fully connected networks, and (3) surpass a broad spectrum of models including neural networks on tabular data. Furthermore, we demonstrate that RFMs shed light on recently observed deep learning phenomena such as grokking, lottery tickets, simplicity biases, and spurious features. We provide a Python implementation to make our method broadly accessible [\href{https://github.com/aradha/recursive_feature_machines}{GitHub}].
translated by 谷歌翻译
Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive with billions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that's easy, light-weight and universal in text classification: a combination of a simple compressor like gzip with a $k$-nearest-neighbor classifier. Without any training, pre-training or fine-tuning, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distributed datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also performs particularly well in few-shot settings where labeled data are too scarce for DNNs to achieve a satisfying accuracy.
translated by 谷歌翻译
Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning. We first identify and rigorously explore key sources of noise, including client subsampling, data and systems heterogeneity, and data privacy. Surprisingly, our results indicate that even small amounts of noise can significantly impact tuning methods-reducing the performance of state-of-the-art approaches to that of naive baselines. To address noisy evaluation in such scenarios, we propose a simple and effective approach that leverages public proxy data to boost the evaluation signal. Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning.
translated by 谷歌翻译